Hindi Compound Verbs and their Automatic Extraction

Authors

  • Debasri Chakrabarti
  • Hemang Mandalia
  • Ritwik Priya
  • Vaijayanthi M. Sarma
  • Pushpak Bhattacharyya
Abstract

We analyse Hindi complex predicates and propose linguistic tests for their detection. This analysis enables us to identify a category of V+V complex predicates called lexical compound verbs (LCpdVs), which need to be stored in the dictionary. Based on the linguistic analysis, a simple automatic method has been devised for extracting LCpdVs from corpora. We achieve an accuracy of around 98% in this task. The LCpdVs thus extracted may be used to automatically augment lexical resources like wordnets, an otherwise time-consuming and labour-intensive process.

Similar resources

Automatic Classification of Hindi Verbs in Syntactic Perspective

We report on a rule-based, knowledge-base-driven tool to automatically classify Hindi verbs from a syntactic perspective. We also report on developing the largest lexical resource for Hindi verbs, along with information on their class based on valency and some syntactic diagnostic tests, as well as their morphological/inflectional type. We use this resource to develop the tool to automatically cl...

Automatic Extraction of Complex Predicates in Bengali

This paper presents the automatic extraction of Complex Predicates (CPs) in Bengali with a special focus on compound verbs (Verb + Verb) and conjunct verbs (Noun/Adjective + Verb). The lexical patterns of compound and conjunct verbs are extracted based on the information of shallow morphology and available seed lists of verbs. Lexical scopes of compound and conjunct verbs in consecutive sequen...

Corpus based Semi-Automatic Extraction of Persian Compound Verbs and their Relations

Nowadays, Wordnet is used in natural language processing as one of the major linguistic resources. Having such a resource for the Persian language helps researchers in computational linguistics and natural language processing to develop more accurate, better-performing systems. In this research, we propose a model for semi-automatic construction of a Persian wordnet of verbs. Compound ve...

Automatic Generation of Compound Word Lexicon for Hindi Speech Synthesis

This paper addresses the problem of Hindi compound word splitting and its relevance to developing a good-quality phonetizer for Hindi speech synthesis. The constituents of a Hindi compound word are not separated by a space or hyphen. Hence, most of the existing compound splitting algorithms cannot be applied to Hindi. We propose a new technique for automatic extraction of compound words from Hin...

Creation of English and Hindi Verb Hierarchies and their Application to Hindi WordNet Building and English-Hindi MT

Verbs form the pivots of sentences. However, they have not received as much attention as nouns have in ontology and lexical semantics research. Classifying verbs and placing them in a structure according to their selectional preferences and other semantic properties seems essential in most text information processing tasks, such as machine translation and information extraction. The pre...

Publication date: 2008